Overview

Dataset statistics

Number of variables: 28
Number of observations: 1383
Missing cells: 936
Missing cells (%): 2.4%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 302.7 KiB
Average record size in memory: 224.1 B
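
The summary figures above can be cross-checked against the per-variable "Missing" counts reported in the Variables section. The following is a minimal sketch, assuming that "KiB" means 1024 bytes and that only the seven variables listed have missing values (every other variable reports Missing 0).

```python
# Figures copied from the Overview and from the per-variable "Missing" counts.
n_variables = 28
n_observations = 1383

# Variables that report a non-zero "Missing" count in the Variables section.
missing_per_variable = {
    "허리둘레": 2,
    "요단백": 4,
    "총콜레스테롤": 223,
    "HDL": 225,
    "중성지방": 225,
    "LDL": 247,
    "ALT": 10,
}

total_cells = n_variables * n_observations
missing_cells = sum(missing_per_variable.values())
missing_pct = 100 * missing_cells / total_cells

# Assumption: 302.7 KiB is measured with 1 KiB = 1024 bytes.
avg_record_bytes = 302.7 * 1024 / n_observations

print(total_cells, missing_cells, round(missing_pct, 1), round(avg_record_bytes, 1))
```

The per-variable counts sum to the reported 936 missing cells, and both the percentage and the average record size agree with the report after rounding.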

Variable types

NUM: 23
CAT: 3
UNSUPPORTED: 2

Warnings

검진 시 연령 (age at examination) is highly correlated with 생년 (birth year) [High correlation]
생년 (birth year) is highly correlated with 검진 시 연령 (age at examination) [High correlation]
총콜레스테롤 (total cholesterol) has 223 (16.1%) missing values [Missing]
HDL has 225 (16.3%) missing values [Missing]
중성지방 (triglycerides) has 225 (16.3%) missing values [Missing]
LDL has 247 (17.9%) missing values [Missing]
HDL is highly skewed (γ1 = 32.32320688) [Skewed]
Unnamed: 0 has unique values [Unique]
요단백 (urine protein) is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
폐결핵흉부질환 (pulmonary tuberculosis / chest disease) is an unsupported type; check whether it needs cleaning or further analysis [Unsupported]
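
The high correlation between 검진 시 연령 and 생년 is what one would expect if age is derived from the examination year and the birth year. The sketch below checks that idea only against the twenty rows shown in the Sample section; the `age_from_years` helper and the "+ 1" counting convention are assumptions inferred from those rows, not something the report states.

```python
# (birth year, examination year, age) triples copied from the Sample section.
first_rows = [(1977, year, age) for year, age in zip(range(2009, 2019), range(33, 43))]
last_rows = [(1972, year, age) for year, age in zip(range(2011, 2021), range(40, 50))]


def age_from_years(birth_year, exam_year):
    # Hypothetical rule inferred from the sample rows only.
    return exam_year - birth_year + 1


mismatches = [
    row for row in first_rows + last_rows
    if age_from_years(row[0], row[1]) != row[2]
]
print(len(mismatches))
```

If the same rule holds for the remaining rows, one of the two variables carries little information beyond the other plus 검진년도.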

Reproduction

Analysis started: 2020-11-02 02:21:35.237379
Analysis finished: 2020-11-02 02:22:59.039124
Duration: 1 minute and 23.8 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml
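
As a quick consistency check, the reported duration can be recomputed from the two timestamps. This is a minimal sketch using only the Python standard library.

```python
from datetime import datetime

# Timestamps copied from the Reproduction block.
started = datetime.fromisoformat("2020-11-02 02:21:35.237379")
finished = datetime.fromisoformat("2020-11-02 02:22:59.039124")

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)
print(int(minutes), round(seconds, 1))
```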

Variables

Unnamed: 0
Real number (ℝ≥0)

UNIQUE

Distinct: 1383
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 691
Minimum: 0
Maximum: 1382
Zeros: 1
Zeros (%): 0.1%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 69.1
Q1: 345.5
median: 691
Q3: 1036.5
95-th percentile: 1312.9
Maximum: 1382
Range: 1382
Interquartile range (IQR): 691

Descriptive statistics

Standard deviation: 399.3820226
Coefficient of variation (CV): 0.5779768779
Kurtosis: -1.2
Mean: 691
Median Absolute Deviation (MAD): 346
Skewness: 0
Sum: 955653
Variance: 159506
Monotonicity: Strictly increasing

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
1382 | 1 | 0.1%
463 | 1 | 0.1%
455 | 1 | 0.1%
456 | 1 | 0.1%
457 | 1 | 0.1%
458 | 1 | 0.1%
459 | 1 | 0.1%
460 | 1 | 0.1%
461 | 1 | 0.1%
462 | 1 | 0.1%
Other values (1373) | 1373 | 99.3%

Value | Count | Frequency (%)
0 | 1 | 0.1%
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%

Value | Count | Frequency (%)
1382 | 1 | 0.1%
1381 | 1 | 0.1%
1380 | 1 | 0.1%
1379 | 1 | 0.1%
1378 | 1 | 0.1%
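
The statistics for Unnamed: 0 are consistent with a plain row index. The sketch below reproduces several of them with the Python standard library, assuming the column holds the integers 0 to 1382 exactly once each (suggested by Minimum 0, Maximum 1382, 1383 distinct values and strict monotonicity, but still an assumption).

```python
import statistics

# Assumption: "Unnamed: 0" is the row index 0..1382.
index = list(range(1383))

mean = statistics.mean(index)
variance = statistics.variance(index)  # sample variance, n - 1 denominator
std = statistics.stdev(index)
cv = std / mean                        # coefficient of variation
median = statistics.median(index)
mad = statistics.median(abs(x - median) for x in index)
# "inclusive" interpolates linearly between the observed minimum and maximum.
quartiles = statistics.quantiles(index, n=4, method="inclusive")

print(mean, variance, round(std, 4), round(cv, 4), mad, quartiles)
```

Under that assumption the variance, MAD and quartiles match the reported values, which suggests the report uses the sample (n - 1) variance and linear interpolation for quantiles.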
 

성별 (sex)
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 10.8 KiB

Value | Count | Frequency (%)
M | 753 | 54.4%
F | 630 | 45.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

생년 (birth year)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 34
Distinct (%): 2.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1972.057845
Minimum: 1954
Maximum: 1990
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 1954
5-th percentile: 1958
Q1: 1967
median: 1971
Q3: 1978
95-th percentile: 1987
Maximum: 1990
Range: 36
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 8.187446374
Coefficient of variation (CV): 0.00415172729
Kurtosis: -0.4208627793
Mean: 1972.057845
Median Absolute Deviation (MAD): 6
Skewness: -0.01778567806
Sum: 2727356
Variance: 67.03427813
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=34)

Value | Count | Frequency (%)
1970 | 130 | 9.4%
1971 | 120 | 8.7%
1980 | 84 | 6.1%
1977 | 82 | 5.9%
1972 | 80 | 5.8%
1967 | 70 | 5.1%
1978 | 62 | 4.5%
1964 | 60 | 4.3%
1973 | 50 | 3.6%
1961 | 50 | 3.6%
Other values (24) | 595 | 43.0%

Value | Count | Frequency (%)
1954 | 20 | 1.4%
1956 | 20 | 1.4%
1957 | 10 | 0.7%
1958 | 42 | 3.0%
1959 | 30 | 2.2%

Value | Count | Frequency (%)
1990 | 20 | 1.4%
1989 | 10 | 0.7%
1988 | 20 | 1.4%
1987 | 42 | 3.0%
1985 | 10 | 0.7%
 

검진년도 (examination year)
Real number (ℝ≥0)

Distinct: 13
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2014.554591
Minimum: 2008
Maximum: 2020
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 2008
5-th percentile: 2010
Q1: 2012
median: 2015
Q3: 2017
95-th percentile: 2019
Maximum: 2020
Range: 12
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.008866341
Coefficient of variation (CV): 0.001493564063
Kurtosis: -1.076421414
Mean: 2014.554591
Median Absolute Deviation (MAD): 3
Skewness: -0.01147695336
Sum: 2786129
Variance: 9.05327666
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=13)

Value | Count | Frequency (%)
2018 | 137 | 9.9%
2017 | 137 | 9.9%
2015 | 137 | 9.9%
2012 | 137 | 9.9%
2011 | 137 | 9.9%
2016 | 136 | 9.8%
2014 | 136 | 9.8%
2013 | 136 | 9.8%
2019 | 107 | 7.7%
2010 | 96 | 6.9%
Other values (3) | 87 | 6.3%

Value | Count | Frequency (%)
2008 | 2 | 0.1%
2009 | 39 | 2.8%
2010 | 96 | 6.9%
2011 | 137 | 9.9%
2012 | 137 | 9.9%

Value | Count | Frequency (%)
2020 | 46 | 3.3%
2019 | 107 | 7.7%
2018 | 137 | 9.9%
2017 | 137 | 9.9%
2016 | 136 | 9.8%
 

검진 시 연령 (age at examination)
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 46
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43.4967462
Minimum: 21
Maximum: 66
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 21
5-th percentile: 29
Q1: 38
median: 44
Q3: 49
95-th percentile: 58
Maximum: 66
Range: 45
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 8.712057921
Coefficient of variation (CV): 0.2002921754
Kurtosis: -0.3381862541
Mean: 43.4967462
Median Absolute Deviation (MAD): 6
Skewness: -0.02212909723
Sum: 60156
Variance: 75.89995323
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=46)

Value | Count | Frequency (%)
43 | 66 | 4.8%
47 | 63 | 4.6%
46 | 63 | 4.6%
45 | 61 | 4.4%
40 | 60 | 4.3%
42 | 60 | 4.3%
44 | 59 | 4.3%
48 | 58 | 4.2%
41 | 57 | 4.1%
49 | 56 | 4.0%
Other values (36) | 780 | 56.4%

Value | Count | Frequency (%)
21 | 2 | 0.1%
22 | 3 | 0.2%
23 | 7 | 0.5%
24 | 9 | 0.7%
25 | 9 | 0.7%

Value | Count | Frequency (%)
66 | 2 | 0.1%
65 | 4 | 0.3%
64 | 4 | 0.3%
63 | 5 | 0.4%
62 | 9 | 0.7%
 


Real number (ℝ≥0)

Distinct: 196
Distinct (%): 14.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 165.9401302
Minimum: 147
Maximum: 184
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 147
5-th percentile: 150
Q1: 160
median: 166
Q3: 172
95-th percentile: 180
Maximum: 184
Range: 37
Interquartile range (IQR): 12

Descriptive statistics

Standard deviation: 8.669249796
Coefficient of variation (CV): 0.0522432385
Kurtosis: -0.6230548333
Mean: 165.9401302
Median Absolute Deviation (MAD): 6
Skewness: -0.1109052218
Sum: 229495.2
Variance: 75.15589202
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
171 | 61 | 4.4%
162 | 60 | 4.3%
163 | 54 | 3.9%
169 | 51 | 3.7%
170 | 50 | 3.6%
164 | 48 | 3.5%
160 | 47 | 3.4%
168 | 44 | 3.2%
167 | 43 | 3.1%
161 | 39 | 2.8%
Other values (186) | 886 | 64.1%

Value | Count | Frequency (%)
147 | 8 | 0.6%
147.6 | 2 | 0.1%
148 | 4 | 0.3%
148.5 | 2 | 0.1%
148.6 | 2 | 0.1%

Value | Count | Frequency (%)
184 | 1 | 0.1%
183.6 | 1 | 0.1%
183.5 | 1 | 0.1%
183.2 | 1 | 0.1%
183.1 | 1 | 0.1%
 

체중 (body weight)
Real number (ℝ≥0)

Distinct: 238
Distinct (%): 17.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 65.06637744
Minimum: 41.8
Maximum: 115
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 41.8
5-th percentile: 46
Q1: 54
median: 65.5
Q3: 74
95-th percentile: 86
Maximum: 115
Range: 73.2
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 13.14459347
Coefficient of variation (CV): 0.2020182157
Kurtosis: 0.01701816154
Mean: 65.06637744
Median Absolute Deviation (MAD): 10.5
Skewness: 0.4549887159
Sum: 89986.8
Variance: 172.7803376
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
70 | 46 | 3.3%
48 | 41 | 3.0%
72 | 41 | 3.0%
55 | 38 | 2.7%
75 | 34 | 2.5%
51 | 33 | 2.4%
69 | 32 | 2.3%
56 | 32 | 2.3%
54 | 31 | 2.2%
77 | 30 | 2.2%
Other values (228) | 1025 | 74.1%

Value | Count | Frequency (%)
41.8 | 2 | 0.1%
42 | 2 | 0.1%
42.1 | 2 | 0.1%
42.7 | 2 | 0.1%
43 | 8 | 0.6%

Value | Count | Frequency (%)
115 | 1 | 0.1%
114.5 | 1 | 0.1%
114.2 | 1 | 0.1%
113 | 1 | 0.1%
112.4 | 1 | 0.1%
 

허리둘레 (waist circumference)
Real number (ℝ≥0)

Distinct: 121
Distinct (%): 8.8%
Missing: 2
Missing (%): 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 78.53294714
Minimum: 57
Maximum: 118
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 57
5-th percentile: 63.8
Q1: 71
median: 79
Q3: 85
95-th percentile: 94
Maximum: 118
Range: 61
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 9.79959445
Coefficient of variation (CV): 0.1247832255
Kurtosis: 0.4428207551
Mean: 78.53294714
Median Absolute Deviation (MAD): 7
Skewness: 0.4210452011
Sum: 108454
Variance: 96.03205138
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
79 | 69 | 5.0%
81 | 56 | 4.0%
83 | 53 | 3.8%
76 | 53 | 3.8%
84 | 53 | 3.8%
80 | 52 | 3.8%
86 | 48 | 3.5%
73 | 48 | 3.5%
78 | 47 | 3.4%
72 | 47 | 3.4%
Other values (111) | 855 | 61.8%

Value | Count | Frequency (%)
57 | 2 | 0.1%
59 | 4 | 0.3%
60 | 12 | 0.9%
61 | 11 | 0.8%
61.4 | 2 | 0.1%

Value | Count | Frequency (%)
118 | 2 | 0.1%
117.2 | 1 | 0.1%
117 | 1 | 0.1%
116 | 1 | 0.1%
113 | 2 | 0.1%
 

BMI
Real number (ℝ≥0)

Distinct: 158
Distinct (%): 11.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.4232321
Minimum: 17.1
Maximum: 36.4
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 17.1
5-th percentile: 18.81
Q1: 21.15
median: 23.1
Q3: 25.4
95-th percentile: 29.09
Maximum: 36.4
Range: 19.3
Interquartile range (IQR): 4.25

Descriptive statistics

Standard deviation: 3.222933542
Coefficient of variation (CV): 0.1375955943
Kurtosis: 1.046419219
Mean: 23.4232321
Median Absolute Deviation (MAD): 2.1
Skewness: 0.804409292
Sum: 32394.33
Variance: 10.38730062
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
22.2 | 30 | 2.2%
20.9 | 26 | 1.9%
22.6 | 24 | 1.7%
20.4 | 24 | 1.7%
20 | 23 | 1.7%
22 | 23 | 1.7%
22.5 | 22 | 1.6%
21.6 | 22 | 1.6%
21.5 | 22 | 1.6%
23.1 | 21 | 1.5%
Other values (148) | 1146 | 82.9%

Value | Count | Frequency (%)
17.1 | 1 | 0.1%
17.2 | 1 | 0.1%
17.3 | 1 | 0.1%
17.4 | 2 | 0.1%
17.6 | 2 | 0.1%

Value | Count | Frequency (%)
36.4 | 1 | 0.1%
36.3 | 2 | 0.1%
35.9 | 1 | 0.1%
35.2 | 1 | 0.1%
34.8 | 1 | 0.1%
 

시력(좌) (visual acuity, left)
Real number (ℝ≥0)

Distinct: 13
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.046203905
Minimum: 0.1
Maximum: 2
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 0.1
5-th percentile: 0.5
Q1: 0.9
median: 1
Q3: 1.2
95-th percentile: 1.5
Maximum: 2
Range: 1.9
Interquartile range (IQR): 0.3

Descriptive statistics

Standard deviation: 0.3079007519
Coefficient of variation (CV): 0.2943028128
Kurtosis: 0.3156325796
Mean: 1.046203905
Median Absolute Deviation (MAD): 0.2
Skewness: 0.01176742533
Sum: 1446.9
Variance: 0.09480287301
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=13)

Value | Count | Frequency (%)
1.2 | 381 | 27.5%
1 | 323 | 23.4%
1.5 | 208 | 15.0%
0.8 | 127 | 9.2%
0.9 | 112 | 8.1%
0.7 | 73 | 5.3%
0.6 | 63 | 4.6%
0.5 | 39 | 2.8%
0.4 | 23 | 1.7%
2 | 15 | 1.1%
Other values (3) | 19 | 1.4%

Value | Count | Frequency (%)
0.1 | 2 | 0.1%
0.2 | 6 | 0.4%
0.3 | 11 | 0.8%
0.4 | 23 | 1.7%
0.5 | 39 | 2.8%

Value | Count | Frequency (%)
2 | 15 | 1.1%
1.5 | 208 | 15.0%
1.2 | 381 | 27.5%
1 | 323 | 23.4%
0.9 | 112 | 8.1%
 

시력(우) (visual acuity, right)
Real number (ℝ≥0)

Distinct: 15
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.104121475
Minimum: 0.1
Maximum: 9.9
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 0.1
5-th percentile: 0.5
Q1: 0.9
median: 1
Q3: 1.2
95-th percentile: 1.5
Maximum: 9.9
Range: 9.8
Interquartile range (IQR): 0.3

Descriptive statistics

Standard deviation: 0.8636938591
Coefficient of variation (CV): 0.7822453223
Kurtosis: 84.63013322
Mean: 1.104121475
Median Absolute Deviation (MAD): 0.2
Skewness: 8.602062416
Sum: 1527
Variance: 0.7459670822
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=15)

Value | Count | Frequency (%)
1 | 345 | 24.9%
1.2 | 334 | 24.2%
1.5 | 218 | 15.8%
0.9 | 140 | 10.1%
0.8 | 97 | 7.0%
0.7 | 75 | 5.4%
0.5 | 52 | 3.8%
0.6 | 37 | 2.7%
0.4 | 30 | 2.2%
0.3 | 16 | 1.2%
Other values (5) | 39 | 2.8%

Value | Count | Frequency (%)
0.1 | 8 | 0.6%
0.2 | 12 | 0.9%
0.3 | 16 | 1.2%
0.4 | 30 | 2.2%
0.5 | 52 | 3.8%

Value | Count | Frequency (%)
9.9 | 11 | 0.8%
7 | 1 | 0.1%
2 | 7 | 0.5%
1.5 | 218 | 15.8%
1.2 | 334 | 24.2%
 

청력(좌) (hearing, left)
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 10.8 KiB

Value | Count | Frequency (%)
정상 (normal) | 1375 | 99.4%
비정상 (abnormal) | 8 | 0.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 3
Median length: 2
Mean length: 2.005784526
Min length: 2

청력(우) (hearing, right)
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 10.8 KiB

Value | Count | Frequency (%)
정상 (normal) | 1370 | 99.1%
비정상 (abnormal) | 13 | 0.9%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 3
Median length: 2
Mean length: 2.009399855
Min length: 2

수축기혈압 (systolic blood pressure)
Real number (ℝ≥0)

Distinct: 65
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 117.3217643
Minimum: 90
Maximum: 170
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 90
5-th percentile: 100
Q1: 110
median: 117
Q3: 126
95-th percentile: 138
Maximum: 170
Range: 80
Interquartile range (IQR): 16

Descriptive statistics

Standard deviation: 12.34981602
Coefficient of variation (CV): 0.1052644929
Kurtosis: 0.1610889863
Mean: 117.3217643
Median Absolute Deviation (MAD): 8
Skewness: 0.429712944
Sum: 162256
Variance: 152.5179558
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
110 | 185 | 13.4%
120 | 141 | 10.2%
130 | 95 | 6.9%
100 | 81 | 5.9%
118 | 51 | 3.7%
115 | 47 | 3.4%
119 | 34 | 2.5%
112 | 33 | 2.4%
114 | 32 | 2.3%
125 | 30 | 2.2%
Other values (55) | 654 | 47.3%

Value | Count | Frequency (%)
90 | 14 | 1.0%
91 | 2 | 0.1%
92 | 3 | 0.2%
93 | 1 | 0.1%
94 | 2 | 0.1%

Value | Count | Frequency (%)
170 | 1 | 0.1%
166 | 1 | 0.1%
160 | 4 | 0.3%
155 | 3 | 0.2%
154 | 1 | 0.1%
 

이완기혈압 (diastolic blood pressure)
Real number (ℝ≥0)

Distinct: 50
Distinct (%): 3.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 74.63991323
Minimum: 51
Maximum: 113
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 51
5-th percentile: 60
Q1: 70
median: 75
Q3: 80
95-th percentile: 89
Maximum: 113
Range: 62
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 9.202524224
Coefficient of variation (CV): 0.1232922685
Kurtosis: -0.115176544
Mean: 74.63991323
Median Absolute Deviation (MAD): 5
Skewness: 0.1579530878
Sum: 103227
Variance: 84.68645209
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
70 | 210 | 15.2%
80 | 200 | 14.5%
60 | 99 | 7.2%
75 | 65 | 4.7%
78 | 58 | 4.2%
72 | 56 | 4.0%
85 | 37 | 2.7%
88 | 33 | 2.4%
76 | 33 | 2.4%
68 | 32 | 2.3%
Other values (40) | 560 | 40.5%

Value | Count | Frequency (%)
51 | 2 | 0.1%
53 | 2 | 0.1%
54 | 6 | 0.4%
55 | 2 | 0.1%
56 | 6 | 0.4%

Value | Count | Frequency (%)
113 | 1 | 0.1%
101 | 1 | 0.1%
100 | 12 | 0.9%
99 | 2 | 0.1%
98 | 4 | 0.3%
 

요단백 (urine protein)
Unsupported

REJECTED
UNSUPPORTED

Missing: 4
Missing (%): 0.3%
Memory size: 10.9 KiB

헤모글로빈 (hemoglobin)
Real number (ℝ≥0)

Distinct: 74
Distinct (%): 5.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 14.37816341
Minimum: 9.4
Maximum: 18.1
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 9.4
5-th percentile: 12
Q1: 13.4
median: 14.4
Q3: 15.5
95-th percentile: 16.6
Maximum: 18.1
Range: 8.7
Interquartile range (IQR): 2.1

Descriptive statistics

Standard deviation: 1.436907692
Coefficient of variation (CV): 0.09993680353
Kurtosis: -0.3624412743
Mean: 14.37816341
Median Absolute Deviation (MAD): 1.1
Skewness: -0.1815333081
Sum: 19885
Variance: 2.064703716
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
13.7 | 45 | 3.3%
13.4 | 44 | 3.2%
14.9 | 44 | 3.2%
14.1 | 41 | 3.0%
14.6 | 38 | 2.7%
13.8 | 38 | 2.7%
16.1 | 36 | 2.6%
14.3 | 36 | 2.6%
15.4 | 36 | 2.6%
14.7 | 36 | 2.6%
Other values (64) | 989 | 71.5%

Value | Count | Frequency (%)
9.4 | 2 | 0.1%
10.2 | 2 | 0.1%
10.3 | 4 | 0.3%
10.6 | 2 | 0.1%
10.7 | 2 | 0.1%

Value | Count | Frequency (%)
18.1 | 1 | 0.1%
17.9 | 1 | 0.1%
17.8 | 1 | 0.1%
17.6 | 1 | 0.1%
17.5 | 1 | 0.1%
 

공복혈당 (fasting blood glucose)
Real number (ℝ≥0)

Distinct: 77
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 93.87418655
Minimum: 58
Maximum: 221
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 58
5-th percentile: 76
Q1: 86
median: 93
Q3: 100
95-th percentile: 114
Maximum: 221
Range: 163
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 13.03640978
Coefficient of variation (CV): 0.1388710812
Kurtosis: 14.45369584
Mean: 93.87418655
Median Absolute Deviation (MAD): 7
Skewness: 2.052697324
Sum: 129828
Variance: 169.9479801
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
95 | 62 | 4.5%
91 | 55 | 4.0%
88 | 50 | 3.6%
98 | 49 | 3.5%
85 | 48 | 3.5%
87 | 48 | 3.5%
81 | 46 | 3.3%
99 | 46 | 3.3%
89 | 46 | 3.3%
93 | 46 | 3.3%
Other values (67) | 887 | 64.1%

Value | Count | Frequency (%)
58 | 1 | 0.1%
62 | 1 | 0.1%
65 | 2 | 0.1%
66 | 1 | 0.1%
68 | 1 | 0.1%

Value | Count | Frequency (%)
221 | 1 | 0.1%
215 | 1 | 0.1%
181 | 1 | 0.1%
174 | 1 | 0.1%
160 | 1 | 0.1%
 

총콜레스테롤 (total cholesterol)
Real number (ℝ≥0)

MISSING

Distinct: 166
Distinct (%): 14.3%
Missing: 223
Missing (%): 16.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 192.0732759
Minimum: 101
Maximum: 341
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 101
5-th percentile: 140
Q1: 168
median: 191
Q3: 213
95-th percentile: 249.05
Maximum: 341
Range: 240
Interquartile range (IQR): 45

Descriptive statistics

Standard deviation: 34.92497532
Coefficient of variation (CV): 0.181831518
Kurtosis: 0.783321211
Mean: 192.0732759
Median Absolute Deviation (MAD): 23
Skewness: 0.4815262834
Sum: 222805
Variance: 1219.753901
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
181 | 28 | 2.0%
191 | 23 | 1.7%
185 | 22 | 1.6%
193 | 20 | 1.4%
192 | 19 | 1.4%
194 | 19 | 1.4%
206 | 18 | 1.3%
190 | 18 | 1.3%
170 | 17 | 1.2%
159 | 17 | 1.2%
Other values (156) | 959 | 69.3%
(Missing) | 223 | 16.1%

Value | Count | Frequency (%)
101 | 1 | 0.1%
110 | 1 | 0.1%
112 | 1 | 0.1%
113 | 3 | 0.2%
119 | 1 | 0.1%

Value | Count | Frequency (%)
341 | 1 | 0.1%
335 | 1 | 0.1%
317 | 1 | 0.1%
313 | 2 | 0.1%
310 | 2 | 0.1%
 

HDL
Real number (ℝ≥0)

MISSING
SKEWED

Distinct: 79
Distinct (%): 6.8%
Missing: 225
Missing (%): 16.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 59.3402418
Minimum: 21
Maximum: 2711
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 21
5-th percentile: 37
Q1: 46
median: 55
Q3: 66
95-th percentile: 84
Maximum: 2711
Range: 2690
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 79.33940714
Coefficient of variation (CV): 1.337025343
Kurtosis: 1081.007982
Mean: 59.3402418
Median Absolute Deviation (MAD): 10
Skewness: 32.32320688
Sum: 68716
Variance: 6294.741525
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
53 | 43 | 3.1%
49 | 42 | 3.0%
60 | 41 | 3.0%
46 | 37 | 2.7%
47 | 35 | 2.5%
55 | 35 | 2.5%
50 | 34 | 2.5%
56 | 33 | 2.4%
57 | 31 | 2.2%
43 | 31 | 2.2%
Other values (69) | 796 | 57.6%
(Missing) | 225 | 16.3%

Value | Count | Frequency (%)
21 | 2 | 0.1%
22 | 1 | 0.1%
25 | 1 | 0.1%
28 | 1 | 0.1%
29 | 2 | 0.1%

Value | Count | Frequency (%)
2711 | 1 | 0.1%
111 | 2 | 0.1%
104 | 2 | 0.1%
103 | 1 | 0.1%
101 | 1 | 0.1%
 

중성지방 (triglycerides)
Real number (ℝ≥0)

MISSING

Distinct: 287
Distinct (%): 24.8%
Missing: 225
Missing (%): 16.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 130.8454231
Minimum: 23
Maximum: 1876
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 23
5-th percentile: 41
Q1: 67
median: 99
Q3: 152.75
95-th percentile: 301
Maximum: 1876
Range: 1853
Interquartile range (IQR): 85.75

Descriptive statistics

Standard deviation: 129.970535
Coefficient of variation (CV): 0.9933135744
Kurtosis: 66.28401295
Mean: 130.8454231
Median Absolute Deviation (MAD): 39
Skewness: 6.570088371
Sum: 151519
Variance: 16892.33996
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
51 | 18 | 1.3%
52 | 17 | 1.2%
86 | 16 | 1.2%
84 | 16 | 1.2%
54 | 15 | 1.1%
96 | 15 | 1.1%
90 | 14 | 1.0%
46 | 13 | 0.9%
81 | 13 | 0.9%
49 | 13 | 0.9%
Other values (277) | 1008 | 72.9%
(Missing) | 225 | 16.3%

Value | Count | Frequency (%)
23 | 2 | 0.1%
24 | 2 | 0.1%
28 | 2 | 0.1%
29 | 2 | 0.1%
30 | 2 | 0.1%

Value | Count | Frequency (%)
1876 | 1 | 0.1%
1691 | 1 | 0.1%
1561 | 1 | 0.1%
999 | 3 | 0.2%
978 | 1 | 0.1%
 

LDL
Real number (ℝ≥0)

MISSING

Distinct: 156
Distinct (%): 13.7%
Missing: 247
Missing (%): 17.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 113.1602113
Minimum: 26
Maximum: 1156
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 26
5-th percentile: 60.75
Q1: 89
median: 111
Q3: 130
95-th percentile: 165
Maximum: 1156
Range: 1130
Interquartile range (IQR): 41

Descriptive statistics

Standard deviation: 61.76846665
Coefficient of variation (CV): 0.5458496936
Kurtosis: 197.2668211
Mean: 113.1602113
Median Absolute Deviation (MAD): 21
Skewness: 12.05065725
Sum: 128550
Variance: 3815.343473
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
114 | 24 | 1.7%
112 | 22 | 1.6%
118 | 22 | 1.6%
98 | 22 | 1.6%
113 | 21 | 1.5%
99 | 21 | 1.5%
61 | 19 | 1.4%
120 | 19 | 1.4%
89 | 19 | 1.4%
90 | 17 | 1.2%
Other values (146) | 930 | 67.2%
(Missing) | 247 | 17.9%

Value | Count | Frequency (%)
26 | 1 | 0.1%
28 | 2 | 0.1%
40 | 2 | 0.1%
41 | 2 | 0.1%
45 | 1 | 0.1%

Value | Count | Frequency (%)
1156 | 1 | 0.1%
1126 | 2 | 0.1%
253 | 1 | 0.1%
233 | 1 | 0.1%
221 | 1 | 0.1%
 

혈청크레아티닌 (serum creatinine)
Real number (ℝ≥0)

Distinct: 24
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9275488069
Minimum: 0.1
Maximum: 11
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 0.1
5-th percentile: 0.6
Q1: 0.7
median: 0.9
Q3: 1
95-th percentile: 1.2
Maximum: 11
Range: 10.9
Interquartile range (IQR): 0.3

Descriptive statistics

Standard deviation: 0.7007130906
Coefficient of variation (CV): 0.7554460589
Kurtosis: 120.8647559
Mean: 0.9275488069
Median Absolute Deviation (MAD): 0.1
Skewness: 10.39858591
Sum: 1282.8
Variance: 0.4909988354
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=24)

Value | Count | Frequency (%)
0.8 | 259 | 18.7%
0.9 | 248 | 17.9%
1 | 209 | 15.1%
0.7 | 190 | 13.7%
1.1 | 159 | 11.5%
0.6 | 134 | 9.7%
1.2 | 80 | 5.8%
0.5 | 47 | 3.4%
1.3 | 17 | 1.2%
1.4 | 9 | 0.7%
Other values (14) | 31 | 2.2%

Value | Count | Frequency (%)
0.1 | 7 | 0.5%
0.3 | 3 | 0.2%
0.4 | 6 | 0.4%
0.5 | 47 | 3.4%
0.6 | 134 | 9.7%

Value | Count | Frequency (%)
11 | 1 | 0.1%
10.1 | 1 | 0.1%
9.6 | 1 | 0.1%
9 | 1 | 0.1%
8.5 | 1 | 0.1%
 

신사구체여과율 (glomerular filtration rate)
Real number (ℝ≥0)

Distinct: 140
Distinct (%): 10.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 102.1988431
Minimum: 26
Maximum: 1713
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 26
5-th percentile: 67
Q1: 79.5
median: 91
Q3: 110
95-th percentile: 146.9
Maximum: 1713
Range: 1687
Interquartile range (IQR): 30.5

Descriptive statistics

Standard deviation: 92.04211707
Coefficient of variation (CV): 0.9006179942
Kurtosis: 200.4556767
Mean: 102.1988431
Median Absolute Deviation (MAD): 14
Skewness: 13.46809676
Sum: 141341
Variance: 8471.751316
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
87 | 43 | 3.1%
81 | 42 | 3.0%
82 | 42 | 3.0%
86 | 38 | 2.7%
95 | 36 | 2.6%
78 | 30 | 2.2%
75 | 30 | 2.2%
84 | 30 | 2.2%
88 | 29 | 2.1%
94 | 29 | 2.1%
Other values (130) | 1034 | 74.8%

Value | Count | Frequency (%)
26 | 2 | 0.1%
35 | 2 | 0.1%
39 | 1 | 0.1%
40 | 2 | 0.1%
44 | 1 | 0.1%

Value | Count | Frequency (%)
1713 | 1 | 0.1%
1527 | 1 | 0.1%
1441 | 1 | 0.1%
1390 | 2 | 0.1%
817 | 2 | 0.1%
 

AST
Real number (ℝ≥0)

Distinct: 63
Distinct (%): 4.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.58929863
Minimum: 9
Maximum: 122
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 9
5-th percentile: 14
Q1: 18
median: 21
Q3: 27
95-th percentile: 40
Maximum: 122
Range: 113
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 10.51746262
Coefficient of variation (CV): 0.445857369
Kurtosis: 22.65950363
Mean: 23.58929863
Median Absolute Deviation (MAD): 4
Skewness: 3.661401207
Sum: 32624
Variance: 110.61702
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
20 | 108 | 7.8%
22 | 98 | 7.1%
17 | 96 | 6.9%
18 | 93 | 6.7%
19 | 92 | 6.7%
15 | 79 | 5.7%
23 | 73 | 5.3%
21 | 65 | 4.7%
24 | 65 | 4.7%
16 | 62 | 4.5%
Other values (53) | 552 | 39.9%

Value | Count | Frequency (%)
9 | 2 | 0.1%
10 | 9 | 0.7%
11 | 6 | 0.4%
12 | 12 | 0.9%
13 | 22 | 1.6%

Value | Count | Frequency (%)
122 | 1 | 0.1%
116 | 1 | 0.1%
115 | 1 | 0.1%
106 | 1 | 0.1%
96 | 2 | 0.1%
 

ALT
Real number (ℝ≥0)

Distinct: 86
Distinct (%): 6.3%
Missing: 10
Missing (%): 0.7%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.97305171
Minimum: 6
Maximum: 396
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 6
5-th percentile: 9
Q1: 14
median: 19
Q3: 28
95-th percentile: 52
Maximum: 396
Range: 390
Interquartile range (IQR): 14

Descriptive statistics

Standard deviation: 22.17492745
Coefficient of variation (CV): 0.9249939355
Kurtosis: 131.0203062
Mean: 23.97305171
Median Absolute Deviation (MAD): 6
Skewness: 9.12249408
Sum: 32915
Variance: 491.7274074
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
13 | 77 | 5.6%
14 | 74 | 5.4%
15 | 70 | 5.1%
17 | 66 | 4.8%
12 | 61 | 4.4%
10 | 60 | 4.3%
11 | 57 | 4.1%
19 | 57 | 4.1%
18 | 55 | 4.0%
16 | 54 | 3.9%
Other values (76) | 742 | 53.7%

Value | Count | Frequency (%)
6 | 8 | 0.6%
7 | 17 | 1.2%
8 | 28 | 2.0%
9 | 24 | 1.7%
10 | 60 | 4.3%

Value | Count | Frequency (%)
396 | 1 | 0.1%
362 | 1 | 0.1%
349 | 1 | 0.1%
124 | 1 | 0.1%
123 | 1 | 0.1%
 

GTP
Real number (ℝ≥0)

Distinct: 151
Distinct (%): 10.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 36.20679682
Minimum: 2
Maximum: 999
Zeros: 0
Zeros (%): 0.0%
Memory size: 10.8 KiB

Quantile statistics

Minimum: 2
5-th percentile: 9
Q1: 14
median: 23
Q3: 39
95-th percentile: 109.9
Maximum: 999
Range: 997
Interquartile range (IQR): 25

Descriptive statistics

Standard deviation: 48.68901325
Coefficient of variation (CV): 1.34474788
Kurtosis: 120.2866444
Mean: 36.20679682
Median Absolute Deviation (MAD): 10
Skewness: 7.934466122
Sum: 50074
Variance: 2370.620012
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
11 | 72 | 5.2%
15 | 66 | 4.8%
12 | 64 | 4.6%
14 | 63 | 4.6%
13 | 55 | 4.0%
20 | 52 | 3.8%
16 | 49 | 3.5%
17 | 47 | 3.4%
18 | 37 | 2.7%
24 | 36 | 2.6%
Other values (141) | 842 | 60.9%

Value | Count | Frequency (%)
2 | 2 | 0.1%
3 | 6 | 0.4%
4 | 8 | 0.6%
5 | 12 | 0.9%
6 | 2 | 0.1%

Value | Count | Frequency (%)
999 | 1 | 0.1%
466 | 1 | 0.1%
350 | 1 | 0.1%
292 | 1 | 0.1%
289 | 1 | 0.1%
 

폐결핵흉부질환 (pulmonary tuberculosis / chest disease)
Unsupported

REJECTED
UNSUPPORTED

Missing: 0
Missing (%): 0.0%
Memory size: 10.9 KiB

Interactions

Correlations

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
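
The calculation above can be sketched in a few lines of standard-library Python. This is an illustrative implementation, not the code the report itself runs; the function name `pearson_r` and the toy data are made up for the example.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/(n - 1) factors of the covariance and the standard deviations cancel.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up toy data, not taken from the dataset.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
r = pearson_r(x, y)

# Shifting and rescaling y (with a positive factor) leaves r unchanged.
r_rescaled = pearson_r(x, [3 * v + 7 for v in y])
print(r, r_rescaled)
```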

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
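
A minimal sketch of that rank-based calculation follows, assuming tied values receive the average of the ranks they span (a common convention, not confirmed by the report). The helper names and the toy data are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation computed on the rank variables of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Made-up data: y = x**3 is monotonic but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
print(rho)
```

Because the relationship is strictly increasing, the ranks of x and y coincide and ρ reaches 1 even though the relationship is not linear.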

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
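
The pair-counting definition can be sketched directly. This is the simple form in which tied pairs count as neither concordant nor discordant (often called tau-a); library implementations often default to a tie-corrected variant, so results can differ when ties are present. The function name and data are made up for illustration.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # sign == 0 means a tie in x or y; it counts toward neither total.
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Made-up toy data: 8 concordant and 2 discordant pairs out of 10.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
tau = kendall_tau_a(x, y)
print(tau)
```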

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for it.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
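
As an illustration, the sketch below computes a bias-corrected Cramér's V from a contingency table, using the correction as it is commonly stated for Bergsma's proposal. Whether the report's implementation matches it term for term is an assumption. The function name is hypothetical, and the code assumes every row and column total is non-zero.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Corrections applied to phi squared and to the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Made-up tables: one exactly independent, one perfectly associated.
v_independent = cramers_v_corrected([[10, 20], [20, 40]])
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])
print(v_independent, v_perfect)
```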

Missing values

Sample

First rows

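(index) | Unnamed: 0 | 성별 | 생년 | 검진년도 | 검진 시 연령 | (unnamed) | 체중 | 허리둘레 | BMI | 시력(좌) | 시력(우) | 청력(좌) | 청력(우) | 수축기혈압 | 이완기혈압 | 요단백 | 헤모글로빈 | 공복혈당 | 총콜레스테롤 | HDL | 중성지방 | LDL | 혈청크레아티닌 | 신사구체여과율 | AST | ALT | GTP | 폐결핵흉부질환
0 | 0 | F | 1977 | 2009 | 33 | 152.0 | 47.0 | 67.0 | 20.3 | 0.8 | 0.5 | 정상 | 정상 | 113 | 72 | 음성 | 14.6 | 102 | 122.0 | 38.0 | 119.0 | 60.0 | 0.7 | 85 | 20 | 20.0 | 24 | 정상
1 | 1 | F | 1977 | 2010 | 34 | 152.0 | 45.0 | 66.0 | 19.5 | 0.6 | 0.4 | 정상 | 정상 | 111 | 75 | 음성 | 13.1 | 94 | 120.0 | 40.0 | 86.0 | 62.0 | 0.8 | 101 | 16 | 15.0 | 23 | 정상
2 | 2 | F | 1977 | 2011 | 35 | 152.0 | 46.0 | 60.0 | 19.9 | 0.1 | 0.3 | 정상 | 정상 | 115 | 75 | 음성 | 13.3 | 100 | 132.0 | 41.0 | 87.0 | 73.0 | 0.7 | 82 | 17 | 12.0 | 22 | 정상
3 | 3 | F | 1977 | 2012 | 36 | 151.0 | 44.0 | 68.0 | 19.3 | 0.5 | 0.4 | 정상 | 정상 | 112 | 81 | 음성 | 14.6 | 113 | 141.0 | 48.0 | 72.0 | 79.0 | 0.8 | 84 | 20 | 17.0 | 17 | 정상
4 | 4 | F | 1977 | 2013 | 37 | 152.0 | 44.0 | 65.0 | 19.0 | 0.6 | 0.5 | 정상 | 정상 | 110 | 70 | 음성 | 14.6 | 112 | 137.0 | 49.0 | 96.0 | 68.0 | 0.7 | 100 | 12 | 7.0 | 21 | 정상
5 | 5 | F | 1977 | 2014 | 38 | 151.0 | 43.0 | 67.0 | 18.9 | 0.5 | 0.5 | 정상 | 정상 | 120 | 80 | 음성 | 14.9 | 112 | 121.0 | 40.0 | 108.0 | 59.0 | 0.6 | 112 | 22 | 23.0 | 50 | 정상
6 | 6 | F | 1977 | 2015 | 39 | 151.0 | 45.0 | 69.0 | 19.7 | 0.5 | 0.5 | 정상 | 정상 | 120 | 80 | 음성 | 14.8 | 115 | 168.0 | 50.0 | 92.0 | 99.0 | 0.6 | 118 | 31 | 39.0 | 50 | 정상
7 | 7 | F | 1977 | 2016 | 40 | 151.0 | 44.0 | 68.0 | 19.3 | 0.4 | 0.4 | 정상 | 정상 | 120 | 80 | 음성 | 14.8 | 118 | 143.0 | 36.0 | 96.0 | 87.0 | 0.7 | 99 | 23 | 18.0 | 27 | 정상
8 | 8 | F | 1977 | 2017 | 41 | 149.9 | 41.8 | 65.0 | 18.6 | 0.6 | 0.4 | 정상 | 정상 | 120 | 80 | 음성 | 14.9 | 105 | 149.0 | 50.0 | 46.0 | 89.0 | 0.7 | 98 | 18 | 16.0 | 26 | 정상
9 | 9 | F | 1977 | 2018 | 42 | 150.9 | 43.6 | 66.0 | 19.1 | 0.6 | 0.3 | 정상 | 정상 | 100 | 70 | 음성 | 14.9 | 123 | NaN | NaN | NaN | NaN | 0.7 | 98 | 18 | 10.0 | 18 | 정상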

Last rows

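(index) | Unnamed: 0 | 성별 | 생년 | 검진년도 | 검진 시 연령 | (unnamed) | 체중 | 허리둘레 | BMI | 시력(좌) | 시력(우) | 청력(좌) | 청력(우) | 수축기혈압 | 이완기혈압 | 요단백 | 헤모글로빈 | 공복혈당 | 총콜레스테롤 | HDL | 중성지방 | LDL | 혈청크레아티닌 | 신사구체여과율 | AST | ALT | GTP | 폐결핵흉부질환
1373 | 1373 | M | 1972 | 2011 | 40 | 182.0 | 81.0 | 91.0 | 24.4 | 0.9 | 1.2 | 정상 | 정상 | 130 | 80 | 음성 | 17.2 | 103 | 205.0 | 48.0 | 369.0 | 83.0 | 1.0 | 129 | 31 | 32.0 | 34 | 정상
1374 | 1374 | M | 1972 | 2012 | 41 | 182.0 | 72.0 | 78.0 | 21.7 | 1.2 | 1.0 | 정상 | 정상 | 125 | 70 | 음성 | 15.8 | 92 | 174.0 | 41.0 | 183.0 | NaN | 1.1 | 78 | 19 | 18.0 | 23 | 정상
1375 | 1375 | M | 1972 | 2013 | 42 | 182.0 | 72.0 | 84.0 | 21.7 | 1.2 | 0.9 | 정상 | 정상 | 125 | 70 | 음성 | 15.7 | 68 | 201.0 | 49.0 | 95.0 | 133.0 | 1.1 | 78 | 28 | 20.0 | 20 | 정상
1376 | 1376 | M | 1972 | 2014 | 43 | 182.0 | 73.0 | 79.0 | 22.0 | 1.2 | 1.0 | 정상 | 정상 | 137 | 85 | 음성 | 16.7 | 92 | 204.0 | 53.0 | 108.0 | 129.0 | 0.1 | 82 | 20 | 16.0 | 20 | 정상
1377 | 1377 | M | 1972 | 2015 | 44 | 181.0 | 76.0 | 77.0 | 23.2 | 0.8 | 1.0 | 정상 | 정상 | 135 | 85 | 음성 | 16.4 | 100 | 214.0 | 58.0 | 148.0 | 126.0 | 1.1 | 73 | 22 | 17.0 | 20 | 정상
1378 | 1378 | M | 1972 | 2016 | 45 | 182.0 | 72.0 | 80.0 | 21.7 | 1.0 | 1.0 | 정상 | 정상 | 124 | 74 | 음성 | 15.3 | 94 | 208.0 | 60.0 | 186.0 | 110.0 | 1.0 | 86 | 20 | 21.0 | 19 | 정상
1379 | 1379 | M | 1972 | 2017 | 46 | 182.0 | 75.0 | 84.0 | 22.6 | 1.0 | 1.0 | 정상 | 정상 | 135 | 80 | 음성 | 16.9 | 90 | 191.0 | 75.0 | 288.0 | 58.0 | 0.9 | 97 | 30 | 24.0 | 23 | 정상
1380 | 1380 | M | 1972 | 2018 | 47 | 182.2 | 76.3 | 79.0 | 23.0 | 1.5 | 1.0 | 정상 | 정상 | 110 | 70 | 음성 | 15.7 | 99 | NaN | NaN | NaN | NaN | 1.0 | 82 | 23 | 28.0 | 22 | 정상
1381 | 1381 | M | 1972 | 2019 | 48 | 183.5 | 74.5 | 84.5 | 22.1 | 0.8 | 1.0 | 정상 | 정상 | 138 | 92 | 음성 | 16.5 | 104 | NaN | NaN | NaN | NaN | 1.0 | 86 | 22 | 24.0 | 23 | 13
1382 | 1382 | M | 1972 | 2020 | 49 | 181.2 | 76.0 | 85.0 | 23.1 | 1.2 | 1.0 | 정상 | 정상 | 132 | 79 | 음성 | 17.4 | 102 | 221.0 | 46.0 | 141.0 | 147.0 | 0.8 | 110 | 26 | 25.0 | 28 | 정상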